The SRI System for the NIST OpenSAD 2015 Speech Activity Detection Evaluation

نویسندگان

  • Martin Graciarena
  • Luciana Ferrer
  • Vikramjit Mitra
چکیده

In this paper, we present the SRI system submission to the NIST OpenSAD 2015 speech activity detection (SAD) evaluation. We present results on three different development databases that we created from the provided data. We present system-development results for feature normalization; for feature fusion with acoustic, voicing, and channel bottleneck features; and finally for SAD bottleneck-feature fusion. We present a novel technique called test adaptive calibration, which is designed to improve decision-threshold selection for each test waveform. We present unsupervised test adaptation of the fusion component and describe its tight synergy to the test adaptive calibration component. Finally, we present results on the evaluation test data and show how the proposed techniques lead to significant gains on channels unseen during training.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

HAPPY Team Entry to NIST OpenSAD Challenge: A Fusion of Short-Term Unsupervised and Segment i-Vector Based Speech Activity Detectors

Speech activity detection (SAD), the task of locating speech segments from a given recording, remains challenging under acoustically degraded conditions. In 2015, National Institute of Standards and Technology (NIST) coordinated OpenSAD bench-mark. We summarize “HAPPY” team effort to OpenSAD. SADs come in both unsupervised and supervised flavors, the latter requiring a labeled training set. Our...

متن کامل

I-Vectors for Speech Activity Detection

I-Vectors are low dimensional front-end features known to effectively preserve the total variability of the signal. Motivated by their successful use for several classification problems such as speaker, language and face recognition, this paper introduces i-vectors for the task of speech activity detection (SAD). In contrast to most state-of-the-art SAD methods that operate at the frame or segm...

متن کامل

The SRI/OGI 2006 spoken term detection system

This paper describes the system developed jointly at SRI and OGI for participation in the 2006 NIST Spoken Term Detection (STD) evaluation. We participated in the three genres of the English track: Broadcast News (BN), Conversational Telephone Speech (CTS), and Conference Meetings (MTG). The system consists of two phases. First, audio indexing, an offline phase, converts the input speech wavefo...

متن کامل

A noise-robust system for NIST 2012 speaker recognition evaluation

The National Institute of Standards and Technology (NIST) 2012 speaker recognition evaluation posed several new challenges including noisy data, varying test-sample length and number of enrollment samples, and a new metric. Target speakers were known during system development and could be used for model training and score normalization. For the evaluation, SRI International (SRI) submitted a sy...

متن کامل

Study of Overlapped Speech Detection for NIST SRE Summed Channel Speaker Recognition

This paper studies the overlapped speech detection for improving the performance of the summed channel speaker recognition system in NIST Speaker Recognition Evaluation (SRE). The speaker recognition system includes four main modules: voice activity detection, speaker diarization, overlapped speaker detection and speaker recognition. We adopt a GMM based overlapped speaker detection system, by ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016